Abstract

The Neyman and Scott (1948) model is widely used to demonstrate 
a serious weakness of the Maximum Likelihood (ML) method: it can 
give rise to inconsistent estimators. The primary objective of this paper 
is to revisit this example with a view to demonstrating that the culprit
for the inconsistent estimation is not the ML method but an ill-defined 
statistical model. It is also shown that a simple recasting of this model 
renders it well-defined and the ML method gives rise to consistent and 
asymptotically efficient estimators. 
1 Introduction

Despite claims for priority by a number of different authors, Maximum
Likelihood (ML) was first articulated as a general method of estimation in the
context of a parametric statistical model by Fisher (1922). It took several
decades to establish the regularity conditions needed to ensure the key
asymptotic properties of ML Estimators (MLE), such as efficiency and
consistency (Cramér, 1946; Wald, 1949), but since then ML has dominated
estimation in frequentist statistics; see Stigler (2007) and Hald (2007) for
this history.

Several counterexamples were proposed in the 1940s and 1950s raising doubts
about the generality of the ML method. These counterexamples include Hodges's
local superefficient estimator (Le Cam, 1953), the mixture of two Normal
distributions (Cox, 2006) and the inconsistent MLE in the Neyman and Scott
(1948) model. Commenting on these examples, Stigler (2007), p. 613, argued
that none are considered serious enough to undermine the credibility of the
ML method, and singled out the last example:

"The Wald-Neyman-Scott example was of more practical import, and still serves 
as a warning of what might occur in modern highly parameterized problems, where 
the information in the data may be spread too thinly to achieve asymptotic
consistency."

The primary objective of this note is to revisit the Neyman-Scott example
with a view to unpacking Stigler's assessment by demonstrating that the real culprit
for the inconsistent estimator is not the ML method, as such, but an ill-defined 
statistical model. It is also shown that a simple recasting of this model renders 
it well-defined and the ML method gives rise to consistent and asymptotically 
efficient estimators. 



2 The Neyman-Scott model 



The quintessential example used to demonstrate that ML might give rise to 
inconsistent estimators is the Neyman-Scott model: 



$$X_{it} = \mu_t + \varepsilon_{it}, \quad \varepsilon_{it} \sim \text{NIID}(0, \sigma^2), \quad i=1,2, \quad t=1,2,\ldots,n,\ldots$$



which can be viewed as a simple time effects panel data model. 

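The underlying distribution of the observable random variables $(X_{1t}, X_{2t})$ is
bivariate Normal, Independent, but not Identically Distributed:

$$\mathbf{X}_t := \begin{pmatrix} X_{1t} \\ X_{2t} \end{pmatrix} \sim \text{NI}\left( \begin{pmatrix} \mu_t \\ \mu_t \end{pmatrix}, \begin{pmatrix} \sigma^2 & 0 \\ 0 & \sigma^2 \end{pmatrix} \right), \quad t=1,2,\ldots,n,\ldots$$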



The non-ID assumption implies that this model suffers from the incidental
parameter problem, in the sense that the number of unknown mean parameters
$(\mu_1, \mu_2, \ldots, \mu_n)$ increases with the sample size. For that reason
the latter are viewed as nuisance parameters, and $\sigma^2$ as the only
parameter of interest.
3 Maximum Likelihood Estimation?



Despite the incidental parameter problem the Neyman-Scott model continues to 
be used as a counterexample to the ML method. That is, the literature ignores 
the incidental parameter problem, and defines the distribution of the sample to 
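be:

$$f(\mathbf{x};\theta) = \prod_{t=1}^{n} \frac{1}{2\pi\sigma^2} \exp\left\{ -\frac{1}{2\sigma^2} \left[ (x_{1t}-\mu_t)^2 + (x_{2t}-\mu_t)^2 \right] \right\}, \quad \theta := (\mu_1, \mu_2, \ldots, \mu_n, \sigma^2).$$

This gives rise to the 'log-likelihood function' (dropping the additive
constant $-n \ln 2\pi$):

$$\ln L(\theta;\mathbf{x}) = -n \ln \sigma^2 - \frac{1}{2\sigma^2} \sum_{t=1}^{n} \left[ (x_{1t}-\mu_t)^2 + (x_{2t}-\mu_t)^2 \right].$$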

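The 'Maximum Likelihood Estimators' (MLE) are supposed to be derived by
solving the first-order conditions:

$$\frac{\partial \ln L(\theta;\mathbf{x})}{\partial \mu_t} = \frac{1}{\sigma^2} \left[ (x_{1t}-\mu_t) + (x_{2t}-\mu_t) \right] = 0, \quad t=1,2,\ldots,n,$$

$$\frac{\partial \ln L(\theta;\mathbf{x})}{\partial \sigma^2} = -\frac{n}{\sigma^2} + \frac{1}{2\sigma^4} \sum_{t=1}^{n} \left[ (x_{1t}-\mu_t)^2 + (x_{2t}-\mu_t)^2 \right] = 0,$$

giving rise to:

$$\hat{\mu}_t = \tfrac{1}{2}(X_{1t} + X_{2t}), \quad t=1,2,\ldots,n,$$

$$\hat{\sigma}^2 = \frac{1}{2n} \sum_{t=1}^{n} \left[ (X_{1t}-\hat{\mu}_t)^2 + (X_{2t}-\hat{\mu}_t)^2 \right] = \frac{1}{n} \sum_{t=1}^{n} s_t^2,$$

where $s_t^2 = \tfrac{1}{2} \left[ (X_{1t}-\hat{\mu}_t)^2 + (X_{2t}-\hat{\mu}_t)^2 \right]$.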

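Notice that for $\ln L(\theta;\mathbf{x})$, $\hat{\theta}_{MLE} := (\hat{\mu}_t, \hat{\sigma}^2, \ t=1,2,\ldots,n)$ is a maximum since
the second derivatives at $\theta = \hat{\theta}_{MLE}$ are:

$$\left. \frac{\partial^2 \ln L(\theta;\mathbf{x})}{\partial \mu_t^2} \right|_{\theta=\hat{\theta}_{MLE}} = -\frac{2}{\hat{\sigma}^2} < 0, \qquad \left. \frac{\partial^2 \ln L(\theta;\mathbf{x})}{\partial \sigma^2 \, \partial \mu_t} \right|_{\theta=\hat{\theta}_{MLE}} = -\frac{1}{\hat{\sigma}^4} \left[ (x_{1t}-\hat{\mu}_t) + (x_{2t}-\hat{\mu}_t) \right] = 0,$$

$$\left. \frac{\partial^2 \ln L(\theta;\mathbf{x})}{\partial (\sigma^2)^2} \right|_{\theta=\hat{\theta}_{MLE}} = \frac{n}{\hat{\sigma}^4} - \frac{1}{\hat{\sigma}^6} \sum_{t=1}^{n} \left[ (x_{1t}-\hat{\mu}_t)^2 + (x_{2t}-\hat{\mu}_t)^2 \right] = -\frac{n}{\hat{\sigma}^4} < 0,$$

and thus:

$$\left. \left[ \left( \frac{\partial^2 \ln L(\theta;\mathbf{x})}{\partial \mu_t^2} \right) \left( \frac{\partial^2 \ln L(\theta;\mathbf{x})}{\partial (\sigma^2)^2} \right) - \left( \frac{\partial^2 \ln L(\theta;\mathbf{x})}{\partial \sigma^2 \, \partial \mu_t} \right)^2 \right] \right|_{\theta=\hat{\theta}_{MLE}} > 0.$$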

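The commonly used argument against the ML method is that since:

$$E(\hat{\mu}_t) = \mu_t \quad \text{and} \quad E(s_t^2) = \tfrac{1}{2}\sigma^2,$$

it follows that the 'MLE' is both biased and inconsistent because:

$$E(\hat{\sigma}^2) = \frac{1}{n} \sum_{t=1}^{n} E(s_t^2) = \tfrac{1}{2}\sigma^2 \quad \text{and} \quad \hat{\sigma}^2 \xrightarrow{a.s.} \tfrac{1}{2}\sigma^2,$$

since the bias $E(\hat{\sigma}^2) - \sigma^2 = -\tfrac{1}{2}\sigma^2$ does not go to zero as $n \to \infty$.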
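
The bias argument can be checked numerically. What follows is a minimal simulation sketch, not part of the original paper: the function names, the uniform range used to draw the incidental means, and the seed are illustrative assumptions. It generates pairs from the Neyman-Scott model and evaluates the estimator of the error variance obtained from the first-order conditions.

```python
import random


def simulate_neyman_scott(n, sigma2, seed=0):
    """Draw n pairs (x_1t, x_2t), each pair sharing its own mean mu_t."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    pairs = []
    for _ in range(n):
        # Arbitrary incidental mean for period t (illustrative assumption).
        mu_t = rng.uniform(-10.0, 10.0)
        pairs.append((rng.gauss(mu_t, sigma), rng.gauss(mu_t, sigma)))
    return pairs


def naive_mle_sigma2(pairs):
    """(1/(2n)) * sum_t [(x_1t - mu_hat_t)^2 + (x_2t - mu_hat_t)^2]."""
    total = 0.0
    for x1, x2 in pairs:
        mu_hat = 0.5 * (x1 + x2)  # estimate of mu_t from two observations
        total += (x1 - mu_hat) ** 2 + (x2 - mu_hat) ** 2
    return total / (2 * len(pairs))


if __name__ == "__main__":
    for n in (100, 10_000, 200_000):
        est = naive_mle_sigma2(simulate_neyman_scott(n, sigma2=4.0, seed=1))
        # With sigma2 = 4 the estimate settles near 2, i.e. sigma2 / 2.
        print(n, round(est, 3))
```

Increasing n tightens the estimate around half the true variance rather than around the true variance, which is the inconsistency the argument refers to.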






A moment's reflection reveals that the inconsistency argument is ill-thought
out. This is because the incidental parameter problem renders $\hat{\mu}_t = \tfrac{1}{2}(X_{1t} + X_{2t})$
an inconsistent estimator of $\mu_t$, for $t=1,2,\ldots,n$; there are only two observations
for each $\mu_t$, and thus their variance $Var(\hat{\mu}_t) = \tfrac{1}{2}\sigma^2$ does not go to zero as $n \to \infty$.
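
The point about the period means can be illustrated the same way. The sketch below is an illustration under assumed settings (hypothetical function name, arbitrary uniform range for the means, fixed seed): it averages the squared error of the two-observation mean estimate over all periods, which estimates its variance.

```python
import random


def mean_squared_error_of_mu_hat(n, sigma2, seed=0):
    """Average of (mu_hat_t - mu_t)^2 over t = 1, ..., n."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    total = 0.0
    for _ in range(n):
        mu_t = rng.uniform(-10.0, 10.0)  # illustrative incidental mean
        x1, x2 = rng.gauss(mu_t, sigma), rng.gauss(mu_t, sigma)
        mu_hat = 0.5 * (x1 + x2)
        total += (mu_hat - mu_t) ** 2
    return total / n


if __name__ == "__main__":
    for n in (1_000, 100_000):
        # Stays near sigma2 / 2 however large n becomes.
        print(n, round(mean_squared_error_of_mu_hat(n, sigma2=4.0, seed=2), 3))
```

More periods add more means to estimate but never add observations to any single mean, so the per-period error does not shrink.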

What the critics of the ML method do not appreciate enough is the fact that
treating the unknown parameters $(\mu_1, \mu_2, \ldots, \mu_n)$ as incidental and designating
$\sigma^2$ the only parameter of interest does not let the statistician 'off the hook'.
This is because the parameter of interest $\sigma^2 = E(X_{it} - \mu_t)^2$, defining the variation
around $\mu_t$, invokes the incidental parameters. Put more intuitively, when the
data come in the form of $\mathbf{Z}_0 := (\mathbf{z}_1, \mathbf{z}_2, \ldots, \mathbf{z}_n)$, $\mathbf{z}_t := (x_{1t}, x_{2t})$, one can get to $\sigma^2$
only via $\hat{\mu}_t$, and using $\hat{\mu}_t$ leads to problems because it is an inconsistent estimator.
In that sense (Severini, 2000):

"... this model falls outside of the general framework we are considering since 
the dimension of the parameter $(\mu_1, \mu_2, \ldots, \mu_n, \sigma^2)$ depends on the sample size."
(p. 108) 

That is, calling $\hat{\sigma}^2 = \frac{1}{2n} \sum_{t=1}^{n} \left[ (X_{1t}-\hat{\mu}_t)^2 + (X_{2t}-\hat{\mu}_t)^2 \right]$ a MLE is highly misleading
since the ML method was never meant to be applied to statistical models whose
number of unknown parameters increases with the sample size n.

In truth, one should be very skeptical of any method of estimation which
yields consistent estimators in cases where the statistical model in question is
ill-defined, as in the case of the incidental parameter problem. Hence, the more
interesting question should be:

why would the ML method yield a consistent estimator of $\sigma^2$?

The fact that the ML method does not yield a consistent estimator of $\sigma^2$ should
count in its favor, not against it! To paraphrase Stigler's quotation: the ML
method 'warns the modeler that the information in the data has been spread too
thinly'.

4 Recasting the original Neyman-Scott model 

The question that naturally arises is: can one respecify the above statistical
model to render it well-defined while retaining the parameter of interest? The
answer is surprisingly straightforward. Since the incidental parameter problem
arises because of the unknown but t-varying means $(\mu_1, \mu_2, \ldots, \mu_n)$, one can
respecify the original bivariate model into a univariate simple Normal (one
parameter) model, using the transformation:

$$Y_t = \frac{1}{\sqrt{2}}(X_{1t} - X_{2t}) \sim \text{NIID}(0, \sigma^2), \quad t=1,2,\ldots,n,$$

$$Var(Y_t) = \tfrac{1}{2} \left[ Var(X_{1t}) + Var(X_{2t}) \right] = \sigma^2.$$

This is a sensible thing to do because taking the difference eliminates the
nuisance parameters $(\mu_1, \mu_2, \ldots, \mu_n)$ without affecting the parameter of interest.
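For this simple Normal model, the MLE for $\sigma^2$ is: $\hat{\sigma}^2_{MLE} = \frac{1}{n} \sum_{t=1}^{n} Y_t^2$,
which is unbiased, fully efficient and strongly consistent:

$$E(\hat{\sigma}^2_{MLE}) = \sigma^2, \quad Var(\hat{\sigma}^2_{MLE}) = \frac{2\sigma^4}{n}, \quad \hat{\sigma}^2_{MLE} \xrightarrow{a.s.} \sigma^2.$$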
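
These properties of the recast estimator can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: the function names, the range of the simulated means, the seed and the replication counts are assumptions, not part of the original paper. It applies the scaled-difference transformation to simulated pairs and compares the mean and variance of the resulting estimator across replications with the stated values.

```python
import random


def recast_mle_sigma2(pairs):
    """Mean of Y_t^2, where Y_t = (x_1t - x_2t) / sqrt(2)."""
    return sum((x1 - x2) ** 2 / 2.0 for x1, x2 in pairs) / len(pairs)


def simulate_pairs(n, sigma2, rng):
    """n pairs sharing a period-specific mean (illustrative uniform draw)."""
    sigma = sigma2 ** 0.5
    pairs = []
    for _ in range(n):
        mu_t = rng.uniform(-10.0, 10.0)
        pairs.append((rng.gauss(mu_t, sigma), rng.gauss(mu_t, sigma)))
    return pairs


def monte_carlo(n, sigma2, reps, seed=0):
    """Return (mean, sample variance) of the estimator over reps samples."""
    rng = random.Random(seed)
    draws = [recast_mle_sigma2(simulate_pairs(n, sigma2, rng)) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
    return mean, var


if __name__ == "__main__":
    sigma2, n = 4.0, 200
    mean, var = monte_carlo(n, sigma2, reps=1000, seed=7)
    print("mean of estimates:", round(mean, 3), "target:", sigma2)
    print("variance of estimates:", round(var, 4), "target:", 2 * sigma2 ** 2 / n)
```

Because the differences do not involve the period means, the result does not depend on how those means are generated.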

Notes:

(i) The above recasting of the Neyman-Scott model can be easily extended
to the case $(X_{1t}, X_{2t}, \ldots, X_{mt})$, $2 < m < n$.

(ii) Hald (2007), pp. 182-3, offers an alternative, highly original, way to
sidestep the incidental parameter problem using Fisher's two-stage ML method.
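
The paper does not spell out the extension mentioned in note (i). One way it might go, offered here purely as an assumption, is to replace the single scaled difference by m-1 orthonormal (Helmert-type) contrasts per period. Each contrast eliminates the common mean and keeps the same variance as the errors, and for m=2 the single contrast is exactly the scaled difference used above. All function names are hypothetical.

```python
import math
import random


def helmert_contrasts(xs):
    """Map (x_1, ..., x_m) to m-1 orthonormal contrasts free of the common mean."""
    contrasts = []
    running = 0.0
    for j in range(1, len(xs)):
        running += xs[j - 1]  # x_1 + ... + x_j
        contrasts.append((running - j * xs[j]) / math.sqrt(j * (j + 1)))
    return contrasts


def contrast_mle_sigma2(groups):
    """Mean of squared contrasts over all n * (m - 1) transformed values."""
    ys = [y for g in groups for y in helmert_contrasts(g)]
    return sum(y * y for y in ys) / len(ys)


def simulate_groups(n, m, sigma2, seed=0):
    """n periods with m observations each and one mean per period (illustrative)."""
    rng = random.Random(seed)
    sigma = sigma2 ** 0.5
    groups = []
    for _ in range(n):
        mu_t = rng.uniform(-10.0, 10.0)
        groups.append([rng.gauss(mu_t, sigma) for _ in range(m)])
    return groups


if __name__ == "__main__":
    groups = simulate_groups(n=20_000, m=4, sigma2=4.0, seed=3)
    print(round(contrast_mle_sigma2(groups), 3))  # close to sigma2 = 4.0
```

Under this assumed construction the estimator coincides with the pooled within-period sum of squared deviations divided by n(m-1).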

5 Conclusion 

The main conclusion from the above discussion is that when the ML method 
gives rise to inconsistent estimators, the modeler should take a closer look at 
the assumed statistical model; chances are, it is ill-defined. This is particularly 
true in the case where the assumed model suffers from the incidental parameter 
problem. In such cases the way forward is to recast the original model to render 
it well-defined and then apply the ML method. This argument is illustrated 
above using the Neyman-Scott (1948) model. 